feat(minimax): add MiniMax provider with tier-aware rate limiting#84
Societus wants to merge 6 commits into repowise-dev:main
Conversation
- Add litellm to interactive provider selection menu
- Support LITELLM_BASE_URL for local proxy deployments (no API key required)
- Auto-add openai/ prefix when using api_base for proper LiteLLM routing
- Add dummy API key for local proxies (OpenAI SDK requirement)
- Add validation and tests for litellm provider configuration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… false positives

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add first-class support for Z.AI with OpenAI-compatible API.

- New ZAIProvider with thinking disabled by default for GLM-5 family
- Plan selection: 'coding' (subscription) or 'general' (pay-as-you-go)
- Environment variables: ZAI_API_KEY, ZAI_PLAN, ZAI_BASE_URL, ZAI_THINKING
- Rate limit defaults and auto-detection in CLI helpers

Closes repowise-dev#68
Add RATE_LIMIT_TIERS class attribute and resolve_rate_limiter() static method to BaseProvider. Any provider with subscription tiers can define RATE_LIMIT_TIERS and pass tier + tiers to resolve_rate_limiter() to get automatic tier-aware rate limiter creation.

Precedence: tier > explicit rate_limiter > None. Tier matching is case-insensitive. Invalid tiers raise ValueError.

This is a provider-agnostic foundation -- no provider-specific code. Providers adopt it by defining RATE_LIMIT_TIERS and calling resolve_rate_limiter() in their constructor.

Ref: repowise-dev#68
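A minimal sketch of what that contract could look like. The names `RATE_LIMIT_TIERS` and `resolve_rate_limiter()` come from the commit; the `RateLimiter` dataclass shape and its field names are assumptions for illustration:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RateLimiter:
    # Assumed shape: requests-per-minute / tokens-per-minute budget.
    rpm: int
    tpm: int


class BaseProvider:
    # Providers with subscription tiers override this mapping.
    RATE_LIMIT_TIERS: Dict[str, RateLimiter] = {}

    @staticmethod
    def resolve_rate_limiter(
        tier: Optional[str],
        tiers: Dict[str, RateLimiter],
        rate_limiter: Optional[RateLimiter] = None,
    ) -> Optional[RateLimiter]:
        # Precedence: tier > explicit rate_limiter > None.
        if tier is not None:
            key = tier.lower()  # tier matching is case-insensitive
            if key not in tiers:
                raise ValueError(
                    f"Unknown tier {tier!r}; expected one of {sorted(tiers)}"
                )
            return tiers[key]
        return rate_limiter
```

A provider then adopts the framework by defining `RATE_LIMIT_TIERS` and calling `resolve_rate_limiter(tier, self.RATE_LIMIT_TIERS, rate_limiter)` in its constructor, with no per-provider tier logic.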
Add MiniMax as a built-in provider using the generic tier framework (repowise-dev#82). MiniMax is an OpenAI-compatible API provider with the M2.x model family (M2.7, M2.5, M2.1, M2) and published token plan rate tiers.

Changes:
- New MiniMaxProvider with RATE_LIMIT_TIERS (starter/plus/max/ultra) derived from published 5-hour rolling window limits
- Uses resolve_rate_limiter() from BaseProvider for tier resolution
- reasoning_split=True by default to separate thinking from content
- Bumped retry budget: 5 retries / 30s max for load-shedding tolerance
- Registered in provider registry with openai package dependency hint
- Conservative PROVIDER_DEFAULTS (Starter-tier: 5 RPM / 25K TPM)
- CLI env vars: MINIMAX_API_KEY, MINIMAX_BASE_URL, MINIMAX_REASONING_SPLIT, MINIMAX_TIER
- 30 unit tests (constructor, tiers, generate, stream_chat, registry)

Rate limit tiers (from https://platform.minimax.io/docs/token-plan/intro):
- Starter: 1,500 req/5hrs -> 5 RPM / 25K TPM
- Plus: 4,500 req/5hrs -> 15 RPM / 75K TPM
- Max: 15,000 req/5hrs -> 50 RPM / 250K TPM
- Ultra: 30,000 req/5hrs -> 100 RPM / 500K TPM

Highspeed variants (e.g., MiniMax-M2.7-highspeed) share the same rate limits as their base plan -- the difference is faster inference, not quota.

This provider is structurally identical to Z.AI (repowise-dev#83) and was trivial to implement because both use the generic tier framework. The framework eliminated all per-provider boilerplate for tier resolution.

Depends on: repowise-dev#82 (generic tier framework)
Ref: repowise-dev#68
swati510
left a comment
zai and minimax are missing from providers/llm/__init__.py. The registry.py docstring got updated, but __init__.py didn't.
    console.print(f" [{WARN}]Skipped. Please select another provider.[/]")
    return interactive_provider_select(console, model_flag, repo_path=repo_path)
    # Special case: litellm local proxy doesn't need an API key
    if chosen == "litellm" and os.environ.get("LITELLM_BASE_URL"):
This branch is unreachable: _detect_provider_status (L417-420) already marks litellm as detected when LITELLM_BASE_URL is set, so we never enter the outer `if chosen not in detected` with this combo.
    @@ -268,18 +268,22 @@ def print_phase_header(
        "litellm": "groq/llama-3.1-70b-versatile",
    }
zai and minimax are wired in helpers.py, validate_provider_config, and the registry, but not here, so they won't show up in the interactive init menu. Please add them to _PROVIDER_DEFAULTS, _PROVIDER_ENV, and _PROVIDER_SIGNUP.
    """

    def __init__(
        self,
Since this PR introduces the tier framework on BaseProvider, should zai adopt it too? lite/pro/max have published limits. OK to defer, but it feels odd to land the framework and only wire minimax.
swati510
left a comment
Looks like this is stacked on #83, so the base.py/registry/zai changes are shared. Assuming #83 lands first, this is fine; just calling it out.
Three things:

- My earlier note about _PROVIDER_DEFAULTS / _PROVIDER_ENV / _PROVIDER_SIGNUP in cli/ui.py still stands: zai and minimax are invisible in the interactive init menu. Worth fixing here since this PR ships both.
- MiniMax rate limits are published as 1,500 requests / 5 hours, while our RateLimiter is a 60-second sliding window. Converting to ~5 RPM is a reasonable steady-state approximation, but a user who bursts will see spurious 429s locally, and one who paces slowly can technically exceed quota without tripping our limiter. Fine to ship as-is, but leave a comment acknowledging the window mismatch so nobody chases a ghost bug later.
- MINIMAX_REASONING_SPLIT is parsed as .lower() == "true" in two different branches of helpers.py. Extract a tiny _env_bool helper and accept the usual truthy values (1/yes/on), since that's what users reach for.
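The 5-hour-quota vs 60-second-window mismatch the review describes is easy to see numerically (a sketch; the limiter's internals are not shown here, only the published numbers from the PR):

```python
# MiniMax publishes quota over a 5-hour rolling window; the local
# RateLimiter enforces a 60-second sliding window. Steady-state conversion:
quota_requests = 1500
window_minutes = 5 * 60                 # 300 minutes
rpm = quota_requests // window_minutes  # 5 requests per minute

# A burst of 100 requests in one minute is under 7% of the 5-hour quota,
# yet a 60-second window capped at 5 RPM turns away 95 of them locally.
burst = 100
rejected_locally = burst - rpm
fraction_of_quota = burst / quota_requests
```

This is why the review asks only for a comment acknowledging the approximation rather than a different limiter: the steady-state rate is right, but burst behavior diverges from the provider's actual accounting.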
    if os.environ.get("MINIMAX_BASE_URL"):
        kwargs["base_url"] = os.environ["MINIMAX_BASE_URL"]
    if os.environ.get("MINIMAX_REASONING_SPLIT"):
        kwargs["reasoning_split"] = os.environ["MINIMAX_REASONING_SPLIT"].lower() == "true"
The same .lower() == "true" parsing is copy-pasted at line 357 in the auto-detect path. Extract it into _env_bool(name, default=False) and reuse it. Also accept 1/yes/on; that's what users will type.
…ta warning, clean dead branch

Apologies for the oversight -- these provider dict entries were mostly in place during development but got lost assembling the PR stack.

- Add zai and minimax to _PROVIDER_DEFAULTS, _PROVIDER_ENV, and _PROVIDER_SIGNUP so they appear in interactive init
- Extract _env_bool(name, default=False) helper accepting 1/yes/on/true and reuse for MINIMAX_REASONING_SPLIT parsing in both code paths
- Add session_request_warn to RateLimitConfig: logs a warning when cumulative session requests exceed a threshold, giving users advance notice before hitting long-window provider quotas (e.g. MiniMax's 1500 req/5hr)
- Remove unreachable litellm local-proxy branch (L488): _detect_provider_status already marks litellm as detected when LITELLM_BASE_URL is set, so the guard at L483 makes it unreachable
- Add note about MiniMax 1500 req/5hr vs our 60s window approximation

Addresses review feedback from @swati510 on repowise-dev#84.
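The session-level warning described in that commit might look like this. Only `session_request_warn` and the logging behavior come from the commit message; the surrounding class and field names are assumptions:

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger(__name__)


@dataclass
class RateLimitConfig:
    rpm: int = 5
    tpm: int = 25_000
    # Warn once cumulative session requests pass this threshold, giving
    # advance notice before long-window provider quotas are hit
    # (e.g. MiniMax's 1,500 requests / 5 hours).
    session_request_warn: Optional[int] = None


class SessionCounter:
    """Tracks cumulative requests for one session and warns once."""

    def __init__(self, config: RateLimitConfig) -> None:
        self.config = config
        self.total_requests = 0
        self._warned = False

    def record_request(self) -> None:
        self.total_requests += 1
        threshold = self.config.session_request_warn
        if threshold and not self._warned and self.total_requests > threshold:
            self._warned = True  # warn only once per session
            logger.warning(
                "Session has made %d requests, past the %d-request warning "
                "threshold; you may be approaching the provider's quota.",
                self.total_requests, threshold,
            )
```

Warning once (rather than on every request past the threshold) keeps the notice useful without flooding logs during long runs.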
This is currently stacked on #83 and re-includes all of #83's framework code verbatim. Once #83 lands, please rebase so this PR shrinks to just the MiniMax-specific delta (provider, env vars, registry entry, tests). It will be much easier to review at that point. While you're rebasing, please also pick up the four items below so we can close everything out in one shot:
Once #83 is merged and this is rebased and the four items above are addressed, this is ready to land.
Summary
Add MiniMax as a built-in LLM provider using the generic tier framework from #82.
This PR is a straightforward application of the same pattern as #83. Both MiniMax and Z.AI are OpenAI-compatible APIs with subscription tiers and built-in reasoning models. The generic tier framework made this provider almost mechanical to implement -- the only provider-specific code is the model names, the `reasoning_split` parameter vs Z.AI's `thinking` toggle, and the tier definitions.

Depends on: #82 (generic tier framework -- merge that first)
Why This Was Straightforward
MiniMax shares the same architectural profile as Z.AI:
- OpenAI-compatible API base at https://api.minimax.io/v1
- Driven through the `openai` SDK

The generic framework from #82 eliminated all boilerplate for tier resolution. Adding MiniMax was just: define `RATE_LIMIT_TIERS`, set the base URL, and pick the reasoning parameter name. Everything else is inherited.

Changes
New: MiniMax Provider (`minimax.py`)

- `RATE_LIMIT_TIERS` with Starter/Plus/Max/Ultra configs from published limits
- `resolve_rate_limiter()` from BaseProvider (zero custom tier code)
- `reasoning_split=True` by default (separates thinking from content)

Registry (`registry.py`)

- `minimax` -> `MiniMaxProvider` with `openai` package hint

Rate Limiter (`rate_limiter.py`)

- `PROVIDER_DEFAULTS["minimax"]` = Starter-tier conservative (5 RPM / 25K TPM)

CLI Helpers (`helpers.py`)

- `MINIMAX_API_KEY`, `MINIMAX_BASE_URL`, `MINIMAX_REASONING_SPLIT`, `MINIMAX_TIER` env vars
- Auto-detection via `MINIMAX_API_KEY`

Tests (`test_minimax_provider.py`)

- 30 unit tests (constructor, tiers, generate, stream_chat, registry)

Rate Limit Tiers
From published MiniMax docs (5-hour rolling window):

| Tier | Requests / 5 hrs | Approx. RPM | Approx. TPM |
| --- | --- | --- | --- |
| Starter | 1,500 | 5 | 25K |
| Plus | 4,500 | 15 | 75K |
| Max | 15,000 | 50 | 250K |
| Ultra | 30,000 | 100 | 500K |
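Expressed as Python, the published limits map onto 60-second budgets like this (an illustrative dict, not the provider's actual `RateLimiter` type; RPM is requests-per-5-hours divided by 300 minutes):

```python
# Values from the published 5-hour rolling-window limits; the dict shape
# here is illustrative only.
RATE_LIMIT_TIERS = {
    "starter": {"rpm": 5,   "tpm": 25_000},   # 1,500 req / 5 hrs
    "plus":    {"rpm": 15,  "tpm": 75_000},   # 4,500 req / 5 hrs
    "max":     {"rpm": 50,  "tpm": 250_000},  # 15,000 req / 5 hrs
    "ultra":   {"rpm": 100, "tpm": 500_000},  # 30,000 req / 5 hrs
}
```

Each RPM figure is exactly the 5-hour request quota spread evenly over 300 minutes, which is the steady-state approximation discussed in review.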
Highspeed variants (e.g., MiniMax-M2.7-highspeed) share the same rate limits as their base plan. The difference is model selection (faster inference), not quota.
Ref: https://platform.minimax.io/docs/token-plan/intro
Configuration
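A minimal configuration example using the environment variables this PR adds (all values below are placeholders):

```shell
# Required: MiniMax API key (placeholder value)
export MINIMAX_API_KEY="sk-placeholder"

# Optional overrides
export MINIMAX_TIER="plus"              # starter | plus | max | ultra
export MINIMAX_BASE_URL="https://api.minimax.io/v1"
export MINIMAX_REASONING_SPLIT="true"   # separate thinking from content
```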
Test Plan
    uv run pytest tests/unit/test_providers/test_minimax_provider.py -v  # 30 passed

All 121 provider tests pass with zero regressions.
PR Stack
Related